# Low-resource language processing

Mbart50 Saraiki News Summarization
MIT
A Saraiki news summarization model fine-tuned from the multilingual mBART-50 model, capable of generating concise summaries of Saraiki news content
Text Generation Transformers Other
SaraikiAI
22
0
Aidman Wav2vec2 Large Xls R 300m Irish Colab
Apache-2.0
An Irish speech recognition model based on facebook/wav2vec2-xls-r-300m and fine-tuned on the Common Voice dataset.
Speech Recognition Transformers
AIDman
110
0
Unt 8b
Apache-2.0
The Camel Model is a text generation model based on the transformer architecture, supporting Azerbaijani and trained using reinforcement learning.
Large Language Model Transformers Other
omar07ibrahim
33
2
Whisper Fleurs Small Te In
Apache-2.0
This model is a fine-tuned version of OpenAI's Whisper Small on the FLEURS dataset, focusing on speech recognition tasks and supporting Telugu (te).
Speech Recognition Transformers Other
jonahdvt
14
1
Mt5 Sinhala News Finetunedv3
A text summarization model fine-tuned on Sinhala news data based on Google's mT5-small model
Text Generation Transformers Other
kbrinsly7
159
0
Roberta Uz
MIT
An Uzbek named entity recognition model fine-tuned from XLM-RoBERTa-large, supporting 21 entity types
Sequence Labeling Transformers Other
mustafoyev202
21
1
Whisper Base Pl
Apache-2.0
A speech recognition model fine-tuned on the Polish Common Voice 17.0 dataset based on OpenAI Whisper-base
Speech Recognition Transformers Other
marcsixtysix
27
1
Shark Finetuned Kde4 Ar En
Apache-2.0
Arabic-to-English translation model fine-tuned on the kde4 dataset based on Helsinki-NLP/opus-mt-ar-en
Machine Translation Transformers
ahmedshark
22
1
Romaneng2nep V3
Apache-2.0
This model is fine-tuned from google/mt5-small for converting Romanized Nepali to Nepali text
Machine Translation Transformers Supports Multiple Languages
syubraj
29
1
Mms Tts Div Finetuned Md F02
This is a Transformer-based speech model supporting Dhivehi (Maldivian) speech processing tasks.
Large Language Model Transformers Other
alakxender
28
0
Mt5 XLSUM Ua News
A headline generation model fine-tuned on Ukrainian news datasets based on the multilingual mT5 model, capable of generating concise and accurate headlines for Ukrainian news articles.
Text Generation Transformers Other
yelyah
110
1
Whisper Sinhala Audio To Text
Apache-2.0
A Sinhala speech recognition model fine-tuned from openai/whisper-small that converts Sinhala speech to text.
Speech Recognition Transformers
AqeelShafy7
229
2
Whisper Small Kyrgyz
Kyrgyz automatic speech recognition (ASR) model based on the Whisper architecture, developed with support from the National Commission on Language and Language Policy under the President of the Kyrgyz Republic
Speech Recognition Transformers Other
UlutSoftLLC
841
4
Kubert Central Kurdish BERT Model
KuBERT is a Central Kurdish language model based on the BERT framework, designed to address the scarcity of Kurdish language resources and enhance computational linguistics capabilities.
Large Language Model Transformers
asosoft
128.71k
5
Mt5 Small Amharic Text Summaization
Apache-2.0
A fine-tuned Amharic text summarization model based on google/mt5-small, suitable for news article headline generation tasks.
Text Generation Transformers
yohannesahunm
61
0
Mmlw Roberta Base
Apache-2.0
A Polish sentence embedding model based on RoBERTa architecture, focusing on sentence similarity calculation and feature extraction tasks.
Text Embedding Transformers Other
sdadas
106.30k
3
Nllb Clip Base Siglip
NLLB-CLIP-SigLIP is a multilingual vision-language model that combines the text encoder from NLLB and the image encoder from SigLIP, supporting 201 languages.
Text-to-Image
visheratin
478
1
M2m100 1.2B Ft Ru Kbd 63K
MIT
A translation model fine-tuned on Russian-Kabardian datasets based on facebook/m2m100_1.2B
Machine Translation Transformers Other
anzorq
39
1
Sinhala Roberta Sentence Transformer
This is a sentence-transformers based model for mapping Sinhala sentences into a 768-dimensional vector space, supporting tasks like sentence similarity calculation and semantic search.
Text Embedding Transformers
Ransaka
16
0
MLEAFIT Es2ptt5
Apache-2.0
A Spanish-to-Portuguese translation model based on the T5-small architecture and fine-tuned on the tatoeba dataset, with a reported BLEU score of 11.2994.
Machine Translation Transformers
jdmartinev
38
1
Bodo Roberta Base
MIT
A RoBERTa-based model setup for the Bodo language, comprising a byte-level BPE tokenizer for Bodo and the RoBERTa base configuration.
Large Language Model Transformers
alayaran
26
1
Whisper Small Haitian
Apache-2.0
This model is a fine-tuned version of whisper-small-cv11-french, optimized for Haitian Creole speech recognition
Speech Recognition Transformers
YassineKader
18
2
Bert Restore Punctuation Turkish
MIT
This is a Transformer model for Turkish text punctuation restoration, capable of predicting the correct positions of periods (.), commas (,), and question marks (?).
Sequence Labeling Transformers Other
uygarkurt
55
6
Glot500 Base
Apache-2.0
Glot500 is a multilingual pre-trained model that supports over 500 languages and is trained with the masked language modeling (MLM) objective.
Large Language Model Transformers
cis-lmu
1,990
19
Tags Allnli GroNLP Bert Base Dutch Cased
Dutch BERT-based sentence embedding model that maps text to a 768-dimensional vector space, suitable for semantic similarity calculation and text classification tasks
Text Embedding Transformers Other
textgain
1,067
4
Mt5 Small HunSum 1
Hungarian abstractive summarization model based on the mT5-small architecture and trained on the HunSum-1 dataset
Text Generation Transformers Other
SZTAKI-HLT
14
1
Whisper Small Yoruba
Apache-2.0
This model is a fine-tuned version of openai/whisper-small on the google/fleurs yo_ng dataset, designed for automatic speech recognition tasks in Yoruba.
Speech Recognition Transformers
steja
16
4
Whisper Small Sk Cv11
Apache-2.0
Slovak speech recognition model fine-tuned from OpenAI Whisper-small on the Common Voice 11.0 Slovak dataset
Speech Recognition Transformers Other
mikr
79
2
Slovakbert Skquad
MIT
A question answering model based on SlovakBERT and fine-tuned on the Slovak-language skquad dataset
Question Answering System Transformers Other
TUKE-DeutscheTelekom
17
3
XLMR BERTovski
A language model pretrained on large-scale Bulgarian and Macedonian texts, part of the MaCoCu project
Large Language Model Other
MaCoCu
36
0
Estroberta
Estonian feature extraction model fine-tuned based on XLM-RoBERTa base model
Text Embedding Transformers Other
tartuNLP
1,842
0
Nllb 200 1.3B
A multilingual translation model supporting 200 languages and their writing systems, covering major global language families and dialect variants
Large Language Model Transformers Supports Multiple Languages
facebook
14.03k
57
Marian Finetuned Kde4 En To Ar
Apache-2.0
This model is an English-to-Arabic translation model fine-tuned on the kde4 dataset based on Helsinki-NLP/opus-mt-en-ar.
Machine Translation Transformers
anibahug
19
1
Mbart Finetuned Fa
A generative summarization model fine-tuned on Persian summarization datasets based on MBART-large-50
Text Generation Transformers Other
eslamxm
49
1
Mt5 Base Finetuned Fa
Apache-2.0
A Persian summarization model fine-tuned on pn_summary dataset based on google/mt5-base
Text Generation Transformers Other
ahmeddbahaa
20
1
Mt5 Multilingual XLSum Finetuned Fa Finetuned Ar
A multilingual summarization model based on mT5, specifically fine-tuned for Arabic on the XLSum dataset
Text Generation Transformers Arabic
ahmeddbahaa
13
1
Mt5 Base Finetuned Urdu
Apache-2.0
A summarization model based on google/mt5-base and fine-tuned on the Urdu xlsum dataset
Text Generation Transformers Other
eslamxm
49
0
Wav2vec2 Large Xls R 300m Urdu Cv8 200epochs
Urdu speech recognition model trained on Common Voice dataset, using wav2vec 2.0 architecture
Speech Recognition Transformers
omar47
20
0
Arabart Finetuned Ar
Apache-2.0
A text summarization model fine-tuned on the Arabic summarization dataset xlsum based on the AraBART model
Text Generation Transformers
ahmeddbahaa
40
0
Fullstop Catalan Punctuation Prediction
This model is used to predict punctuation marks in Catalan, capable of restoring periods, commas, question marks, hyphens, and colons.
Sequence Labeling Transformers Other
softcatala
16
1